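\(D_{i}=1\) when \(i\) belongs to the treated group (the group affected by the policy), and \(D_{i}=0\) when \(i\) belongs to the control group (the group not affected by the policy). \(D_{i}\) is an indicator function, or dummy variable, for \(i\) belonging to the group affected by the policy.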
“\(|\)”
Reads “given” or “when”.
\(y_{i}|D_{i}=1\)
\(y_{i}\) when \(i\) belongs to the treated group; also written \(y_{1i}\)
\(y_{i}|D_{i}=0\)
\(y_{i}\) when \(i\) belongs to the control group; also written \(y_{0i}\)
Treatment effect of the policy for individual \(i\): \(y_{1i}-y_{0i}\)
The fundamental problem in program evaluation
We cannot observe \(y_{i}\) under treatment and \(y_{i}\) under control for the same individual \(i\) simultaneously.
\(\Leftrightarrow\)
We cannot observe the counterfactual (CF) of each individual \(i\)’s factual outcome \(y_{i}\).
\(\Leftrightarrow\)
In other words, we cannot compute the causal impact of a policy for each individual \(i\) without further assumptions.
CF for \(y_{i}|D_{i}=1\): An outcome \(y_{i}\) if \(i\) belongs to the control (“\(y_{0i}\)”), when in reality \(i\) belongs to the treated (\(D_{i}=1\)). Write as \(y_{0i}|D_{i}=1\).
CF for \(y_{i}|D_{i}=0\): An outcome \(y_{i}\) if \(i\) belongs to the treated (“\(y_{1i}\)”), when in reality \(i\) belongs to the control (\(D_{i}=0\)). Write as \(y_{1i}|D_{i}=0\).
Without any assumptions, the policy effect for individual \(i\) cannot be computed.
But under treatment randomisation, we can estimate the average causal impact of the policy, the average treatment effect (ATE). \[
ATE=\E[y_{i}|D_{i}=1]-\E[y_{i}|D_{i}=0].
\]
This is not the treatment effect for individual \(i\) but the average treatment effect in the population that \(i\) belongs to.
\(\E\) is the expectation operator; it indicates that we are taking the mean over the entire population that \(i\) belongs to.
Consider the following thought experiment.
Suppose there are a large number \(n\) of individuals and we randomly assign the treatment status \(D_{i}\) to everyone \(i=1,\cdots, n\).
Assume the randomisation was done well (e.g., based on a fair coin toss).
The distributions of \(y_{i}\) in the two groups should look very similar, and in the limit where \(n\rightarrow\infty\) they are identical. If the distributions are very similar, then their means are also very similar. Write the mean in the limit as \(a\).
Suppose further that, in the absence of treatment, the mean of the outcome is \(a\), and the policy impact is the same \(b\) for everyone (“homogeneous” impact).
Conditions under which the ATE estimate is consistent (a consistent estimator converges to the true value as the sample size goes to infinity):
1. The distributions of \(y_{i}\) in the control and the treated groups are very similar before the policy.
2. The impact is homogeneous across \(i\).
Condition 2 is used for simplification. If the impact differs across subgroups, we can use a finer grouping.
Condition 1 is the most important at this stage. It is the randomisation of treatment status among individuals that makes the distributions similar.
E.g., clinical trials explicitly randomise patients between the treated and the placebo (control) groups to get a consistent estimate of ATE.
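The thought experiment can be sketched as a small simulation. This is a minimal illustration, not part of the original slides: the function name, the normal noise, and the default values of \(n\), \(a\), and \(b\) are all assumptions chosen for the example.

```python
import numpy as np

def simulate_ate(n=100_000, a=1.0, b=2.0, seed=0):
    """Difference in group means under coin-toss assignment (illustrative)."""
    rng = np.random.default_rng(seed)
    d = rng.integers(0, 2, size=n)           # fair coin: D_i is 0 or 1
    y0 = a + rng.normal(0.0, 1.0, size=n)    # outcome without the policy, mean a (assumed normal noise)
    y = y0 + b * d                           # homogeneous impact b for the treated
    return y[d == 1].mean() - y[d == 0].mean()

print(simulate_ate())
```

With a large \(n\), the printed difference in means should sit close to the assumed impact \(b\), because the coin toss makes the no-policy distributions of the two groups nearly identical.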
Checking randomisation: permutation test, randomisation test
Lending experiment for the ultra-poor in Bangladesh: comparison of the large-loan and small-loan groups
Source: Estimated with GUK administrative and survey data.
Notes: 1. R’s package coin is used to conduct approximate permutation tests on baseline group means of covariates. 2. The number of repetitions is set to 100000. A step-down method is used to adjust for multiple testing of a multi-factor grouping variable. 40 are lost to flood before arm assignment.
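The idea behind an approximate permutation test of baseline balance can be sketched as follows. This is a plain NumPy illustration for two arms and one covariate, not the R coin package used for the table, and it omits the step-down adjustment; the function name and defaults are hypothetical.

```python
import numpy as np

def permutation_pvalue(x, d, n_perm=10_000, seed=0):
    """Two-sided approximate permutation p-value for a difference in means."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    d = np.asarray(d, dtype=int)
    observed = abs(x[d == 1].mean() - x[d == 0].mean())
    count = 0
    for _ in range(n_perm):
        d_perm = rng.permutation(d)          # reshuffle the arm labels
        diff = abs(x[d_perm == 1].mean() - x[d_perm == 0].mean())
        count += diff >= observed            # as extreme as the observed gap?
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return (count + 1) / (n_perm + 1)
```

If randomisation worked, reshuffling the labels should often produce a gap as large as the observed one, so a large p-value on a baseline covariate is consistent with balance.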
Randomisation of treatment status gives us a consistent ATE estimate.
But… people have a right to decide whether to consent. They may refuse to be treated.
Further… people sometimes cheat. Those assigned to the control may find some way to take part as the treated. What if there is noncompliance?
Except in dictatorships such as North Korea, people have a right to choose. So we cannot force the assigned treatment status on the subjects.
And experimenters are never perfect, so there will be noncompliers.
What we can measure is the difference in group means inclusive of noncompliers. Noncompliance makes estimated impacts smaller.
The effect estimate that includes noncompliers is called the intention-to-treat (ITT) effect; it is like a down-to-earth version of ATE.
It is about effectiveness (impacts in the field) rather than efficacy (impacts in the lab).
Few studies can estimate ATE.
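How noncompliance shrinks the measured impact can be seen in a short simulation. The setup is an assumption made for illustration: only refusals occur (nobody assigned to the control obtains treatment), a hypothetical 60% of those offered treatment accept it, and the impact \(b\) is homogeneous.

```python
import numpy as np

def simulate_itt(n=200_000, a=1.0, b=2.0, take_up=0.6, seed=0):
    """Difference in means by assigned arm when some subjects refuse treatment."""
    rng = np.random.default_rng(seed)
    z = rng.integers(0, 2, size=n)           # random assignment (intention to treat)
    accepts = rng.random(n) < take_up        # hypothetical take-up rate among those offered
    d = z * accepts                          # actual treatment; the control cannot get treated here
    y = a + b * d + rng.normal(0.0, 1.0, size=n)
    return y[z == 1].mean() - y[z == 0].mean()

print(simulate_itt())
```

Under these assumptions the ITT estimate is close to the take-up rate times \(b\), which is smaller than \(b\) itself whenever some subjects refuse.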
Various effect estimators (most empirical studies estimate ITT or LATE)
ATE
Average treatment effects: the mean effect over all individuals \[ATE = \E[y_{i}|D_{i}=1]-\E[y_{i}|D_{i}=0].\]
Average treatment effects on the control (ATC) \[
ATC = \E[y_{1i}|D_{i}=0]-\E[y_{0i}|D_{i}=0].
\]
Difference between ATE and ATT: the mean effect for everyone vs. for the treated only.
Difference between ITT and ATE: everyone including noncompliers vs. everyone with no noncompliers.
ATE is a weighted average of ATT and ATC. \[
ATE = b ATT + (1-b) ATC, \quad b= \frac{n_{\scriptsize{\mbox{treated}}}}{n_{\scriptsize{\mbox{control}}}+n_{\scriptsize{\mbox{treated}}}}.
\]
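The weighted-average identity follows from the law of total expectation. The sketch assumes the standard potential-outcomes definitions \(ATE=\E[y_{1i}-y_{0i}]\) and \(ATT=\E[y_{1i}|D_{i}=1]-\E[y_{0i}|D_{i}=1]\); the ATT definition is not spelled out in this section and is supplied here as an assumption. Here \(b\) is the share of the treated, which estimates \(\Pr(D_{i}=1)\), and not the homogeneous impact of the thought experiment.
\[
\begin{aligned}
ATE &= \E[y_{1i}-y_{0i}] \\
&= \Pr(D_{i}=1)\,\E[y_{1i}-y_{0i}|D_{i}=1] + \Pr(D_{i}=0)\,\E[y_{1i}-y_{0i}|D_{i}=0] \\
&= b\, ATT + (1-b)\, ATC.
\end{aligned}
\]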
Gene-wealth gradient=0 when compared within stock owners/nonowners
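What they have in common is that wealth inequality is explained through stock ownership.
Only the stock ownership information is shared; other channels are not ruled out.
The man searching for his keys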
Selection problem
When the treatment assignment (\(D_{i}\)) is not randomised, then except in very rare lucky cases, the distributions of the outcome measure in the absence of the policy differ between the treated and the control.
This is because the subjects are humans who participate purposefully, so participants and nonparticipants differ in their characteristics.
Self-selection
Selection by potential participants. People with a positive net benefit from participation choose to participate.
Placement selection
Selection by policymakers. If a policymaker is incentivised or instructed to choose a particular group, there is no guarantee that the distributions of the outcome measure in the absence of the policy are similar between the treated (chosen) and the control (unchosen).
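A small simulation shows why the simple participant vs. nonparticipant comparison is biased under self-selection. The participation rule is an assumption for illustration only: an unobserved trait `u` both raises the no-policy outcome and makes participation more likely, and the true impact \(b\) is homogeneous.

```python
import numpy as np

def naive_vs_truth(n=200_000, a=1.0, b=2.0, seed=0):
    """Return the naive estimate and the no-policy outcome gap between groups."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)                   # unobserved trait that raises the no-policy outcome
    v = rng.normal(size=n)                   # other reasons to participate
    d = (u + v > 0).astype(int)              # self-selection, not a coin toss (hypothetical rule)
    y0 = a + u                               # outcome without the policy
    y = y0 + b * d                           # observed outcome
    naive = y[d == 1].mean() - y[d == 0].mean()
    selection_bias = y0[d == 1].mean() - y0[d == 0].mean()
    return naive, selection_bias

naive, bias = naive_vs_truth()
print(naive, bias)
```

In this setup the naive estimate equals the true impact plus the gap in no-policy outcomes between participants and nonparticipants, so it overstates \(b\).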
What we will learn:
Mechanism of self-selection
Bias of the naïve estimator (simple comparison between the participants and the nonparticipants)
Difference-in-differences (DID) estimator and how before-after data of both treated and control can give a consistent estimate of ATT under a mild condition
Treated group outcomes \(y^{1}_{i,t}\) before and after the policy. If there are \(n^{1}\) individuals, \(\underbrace{y^{1}_{1,t}, \dots, y^{1}_{n^{1},t}}_{\mbox{year } t}, \underbrace{y^{1}_{1,t+1}, \dots, y^{1}_{n^{1}, t+1}}_{\mbox{year } t+1}\).
Control group outcomes \(y^{0}_{i,t}\) before and after the policy. If there are \(n^{0}\) individuals, \(\underbrace{y^{0}_{1,t}, \dots, y^{0}_{n^{0},t}}_{\mbox{year } t}, \underbrace{y^{0}_{1,t+1}, \dots, y^{0}_{n^{0}, t+1}}_{\mbox{year } t+1}\).
Individual-level data, not just group-level averages, are needed to do inference (= compute \(p\) values)
Testing a null hypothesis ← standard errors of the estimates ← variances and covariances ← individual level data
Let us denote the smaller of \(n^{1}, n^{0}\) as \(n^{min}\).
Steps:
Compute before and after means for both groups. \[
\bar{y}^{1}_{t}=\frac{y^{1}_{1,t}+ \dots + y^{1}_{n^{1},t}}{n^{1}}=\frac{\sum\limits_{i=1}^{n^{1}}y^{1}_{i,t}}{n^{1}}, \quad
\bar{y}^{0}_{t}=\frac{\sum\limits_{i=1}^{n^{0}}y^{0}_{i,t}}{n^{0}}, \quad
\bar{y}^{1}_{t+1}%=\frac{\sum\limits_{i=1}^{n^{1}}y^{1}_{i,t+1}}{n^{1}}
, \quad
\bar{y}^{0}_{t+1}%=\frac{\sum\limits_{i=1}^{n^{0}}y^{0}_{i,t+1}}{n^{0}}.
\]
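The remaining steps are not shown in this section. A common way to continue, sketched here as an assumption and not as the exact procedure of these slides, is to take the before-after change in each group and subtract the control change from the treated change, \((\bar{y}^{1}_{t+1}-\bar{y}^{1}_{t})-(\bar{y}^{0}_{t+1}-\bar{y}^{0}_{t})\). The function name and the toy numbers are made up for illustration.

```python
import numpy as np

def did_estimate(y1_t, y1_t1, y0_t, y0_t1):
    """Difference-in-differences from individual outcomes of both groups."""
    change_treated = np.mean(y1_t1) - np.mean(y1_t)   # before-after change, treated
    change_control = np.mean(y0_t1) - np.mean(y0_t)   # before-after change, control
    return change_treated - change_control

# Toy data (made up): three treated and three control individuals.
did = did_estimate(
    y1_t=[10, 12, 14], y1_t1=[15, 17, 19],
    y0_t=[8, 9, 10], y0_t1=[10, 11, 12],
)
print(did)
```

The control group's change stands in for what would have happened to the treated without the policy, which is usually stated as a parallel-trends condition.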
What can be done if there is no experiment and no panel data (so DID is not possible)?
Good news: Over a narrower domain, impacts can be estimated.
Consider a poverty reduction policy that gives a subsidy to the people below the poverty line.
“BPL” card in India.
Suppose the poverty line is USD 1.25 per day and this criterion is strictly enforced. So if your income is USD 1.24 per day, you get the money. If your income is USD 1.25, you don’t.
People with daily income of USD 1.24 and USD 1.25 are similar.
Estimate impacts by comparing BPL and APL near the poverty line.
The narrower focus around cutoff gives us a “matched pair” of the treated and the control, or a pseudo counterfactual.
Interpretation of estimates: Policy impacts on the subpopulation near the cutoff. It is a local impact near the cutoff, not a global impact such as ATE (or ATT, ATC).
Because eligibility switches from 0 to 1 around the cutoff, some people describe the RDD estimator as a LATE estimator.
Applications: Cutoffs, geographical boundaries.
Policies are full of cutoffs. So almost every policy has a chance of estimating its impacts near the cutoff.
Identifying assumption:
There is nothing other than the policy which “jumps” discretely around the cutoff point. So a jump in the outcome is attributed only to the policy.
But there is a catch: because we fit the line locally in the neighbourhood of the cutoff, the RDD estimator takes a large sample, on the order of 10,000.
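The local comparison near the cutoff can be sketched with separate linear fits on each side. This is a minimal illustration under stated assumptions: the bandwidth is picked by hand, the data are simulated with a made-up subsidy effect of 0.3 for incomes below USD 1.25, and both function names are hypothetical.

```python
import numpy as np

def rdd_jump(x, y, cutoff, bandwidth):
    """Fitted y just below the cutoff minus fitted y just above it."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    below = (x >= cutoff - bandwidth) & (x < cutoff)
    above = (x >= cutoff) & (x <= cutoff + bandwidth)
    # Centre x at the cutoff so each intercept is the fitted value at the cutoff.
    # np.polyfit with degree 1 returns [slope, intercept].
    _, at_cutoff_below = np.polyfit(x[below] - cutoff, y[below], 1)
    _, at_cutoff_above = np.polyfit(x[above] - cutoff, y[above], 1)
    return at_cutoff_below - at_cutoff_above

def simulate_bpl(n=20_000, cutoff=1.25, effect=0.3, seed=0):
    """Made-up data: a subsidy raises the outcome only for income below the cutoff."""
    rng = np.random.default_rng(seed)
    income = rng.uniform(cutoff - 1.0, cutoff + 1.0, size=n)
    subsidy = income < cutoff
    outcome = 1.0 + 0.5 * income + effect * subsidy + rng.normal(0.0, 0.1, size=n)
    return income, outcome

income, outcome = simulate_bpl()
print(rdd_jump(income, outcome, cutoff=1.25, bandwidth=0.25))
```

Only observations inside the bandwidth enter the fits, which is why a large overall sample is needed to leave enough data near the cutoff.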
Ingenuity of Angrist and Lavy (1999): Predicted class size vs. exam scores.
Impact is more evident in smaller enrollment counts.
Average score is increasing after 60 regardless of predicted class size.
Possible reasons: Greater deviation of actual class size from predicted class size, different pedagogical methods in large schools or more competition/learning among peers.
The Government of Israel still holds it.
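Predicted class size in Angrist and Lavy (1999) comes from Maimonides’ rule, which caps a class at 40 pupils and splits the cohort into equal classes once the cap is exceeded. A minimal sketch of that rule follows; the function name and the `cap` argument are illustrative choices.

```python
def predicted_class_size(enrollment, cap=40):
    """Average class size implied by a maximum class size of `cap` pupils."""
    n_classes = (enrollment - 1) // cap + 1   # a new class opens once the cap is exceeded
    return enrollment / n_classes

for e in (40, 41, 80, 81):
    print(e, predicted_class_size(e))
```

Predicted class size drops sharply when enrollment crosses a multiple of 40, and these discontinuities are what the comparison with exam scores exploits.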
1: In Quebec, unemployment benefits increase once childless adults reach the age of 30. This should create a disincentive to work at age 30 and older. If this is true, work outcomes will drop at around age 30.
Will there be a downward jump in employment rates at age 30?
2: Being an incumbent can give an additional advantage in the next election. If this is true, contrasting incumbents and non-incumbents at a vote share margin close to zero gives the effect of this advantage. The most suitable data come from US state gubernatorial elections, where there are effectively only two candidates/parties.
Will there be an upward jump in winning probability at zero vote margin?
Akabayashi, Hideo, and Ryosuke Nakamura. 2014. “Can Small Class Policy Close the Gap? An Empirical Analysis of Class Size Effects in Japan.” The Japanese Economic Review 65 (3): 253–81. https://doi.org/10.1111/jere.12017.
Angrist, Joshua D., and Victor Lavy. 1999. “Using Maimonides’ Rule to Estimate the Effect of Class Size on Scholastic Achievement.” The Quarterly Journal of Economics 114 (2): 533–75. http://ideas.repec.org/a/tpr/qjecon/v114y1999i2p533-575.html.
Barth, Daniel, Nicholas W. Papageorge, and Kevin Thom. 2020. “Genetic Endowments and Wealth Inequality.” Journal of Political Economy 128 (4): 1474–1522. https://doi.org/10.1086/705415.
Bosch, Mariano, and Norbert Schady. 2019. “The Effect of Welfare Payments on Work: Regression Discontinuity Evidence from Ecuador.” Journal of Development Economics 139: 17–27. https://doi.org/10.1016/j.jdeveco.2019.01.008.
Bursztyn, Leonardo, Davide Cantoni, David Y Yang, Noam Yuchtman, and Y Jane Zhang. 2021. “Persistent Political Engagement: Social Interactions and the Dynamics of Protest Movements.” American Economic Review: Insights 3 (2): 233–50.
Fagereng, Andreas, Magne Mogstad, and Marte Rønning. 2021. “Why Do Wealthy Parents Have Wealthy Children?” Journal of Political Economy 129 (3): 703–56. https://doi.org/10.1086/712446.
Lee, David S. 2008. “Randomized Experiments from Non-Random Selection in U.S. House Elections.” Journal of Econometrics 142 (2): 675–97. https://doi.org/10.1016/j.jeconom.2007.05.004.
Lemieux, Thomas, and Kevin Milligan. 2008. “Incentive Effects of Social Assistance: A Regression Discontinuity Approach.” Journal of Econometrics 142 (2): 807–28. https://doi.org/10.1016/j.jeconom.2007.05.014.
Lumey, LH, and Aryeh D Stein. 1997. “In Utero Exposure to Famine and Subsequent Fertility: The Dutch Famine Birth Cohort Study.” American Journal of Public Health 87 (12): 1962–66.
McDermott, Rose, and Peter K. Hatemi. 2020. “Ethics in Field Experimentation: A Call to Establish New Standards to Protect the Public from Unwanted Manipulation and Real Harms.” Proceedings of the National Academy of Sciences 117 (48): 30014–21. https://doi.org/10.1073/pnas.2012021117.
Miyawaki, Atsushi, Takahiro Tabuchi, Yasutake Tomata, and Yusuke Tsugawa. 2020. “Association Between Participation in Government Subsidy Program for Domestic Travel and Symptoms Indicative of COVID-19 Infection.” medRxiv. Cold Spring Harbor Laboratory Press. https://doi.org/10.1101/2020.12.03.20243352.
Nilsson, J Peter. 2017. “Alcohol Availability, Prenatal Conditions, and Long-Term Economic Outcomes.” Journal of Political Economy 125 (4): 1149–1207.
Persson, Petra, and Maya Rossin-Slater. 2018. “Family Ruptures, Stress, and the Mental Health of the Next Generation.” American Economic Review 108 (4-5): 1214–52.